Fast Statistical Grammar Induction
Abstract
The statistical induction of context-free grammars from bracketed corpora with the Inside-Outside algorithm has often inspired researchers, but its computational complexity has made it impossible to generate a large-scale grammar. The method we suggest achieves the same results as earlier research, but at a much lower cost in computing time. We explain the modifications needed to the algorithm, give results of experiments, and compare these to results reported in other literature.
Similar resources
Induction of Greedy Controllers for Deterministic Treebank Parsers
Most statistical parsers have used the grammar induction approach, in which a stochastic grammar is induced from a treebank. An alternative approach is to induce a controller for a given parsing automaton. Such controllers may be stochastic; here, we focus on greedy controllers, which result in deterministic parsers. We use decision trees to learn the controllers. The resulting parsers are surp...
Grammar Induction by Distributional Clustering with the Fragment Constituency Criterion
This paper proposes that the identification of constituents, which is the core problem in grammar induction, can be accomplished by a simple constituency criterion in linguistics: a word/tag sequence which can occur as a fragment is a constituent. Experiment results show that grammar induction by distributional clustering augmented with this criterion achieves good PARSEVAL scores and improves ...
A Bayesian Model for Supervised Induction of Natural Language Grammar
In this paper, we show that the problem of grammar induction could be modeled as a combination of several model selection problems. We use the infinite generalization of a Bayesian model of cognition to solve each model selection problem in our grammar induction model. This Bayesian model is capable of solving model selection problems, consistent with human cognition. We also show that using th...
Introduction to the Special Topic on Grammar Induction, Representation of Language and Language Learning
Grammar induction refers to the process of learning grammars and languages from data; this finds a variety of applications in syntactic pattern recognition, the modeling of natural language acquisition, data mining and machine translation. This special topic contains several papers presenting some of recent developments in the area of grammar induction and language learning, as applied to vario...
Synchronous Constituent Context Model for Inducing Bilingual Synchronous Structures
Traditional Statistical Machine Translation (SMT) systems heuristically extract synchronous structures from word alignments, while synchronous grammar induction provides better solutions that can discard heuristic method and directly obtain statistically sound bilingual synchronous structures. This paper proposes Synchronous Constituent Context Model (SCCM) for synchronous grammar induction. Th...
Publication date: 1996